Aggregate Data

Aggregate data is high-level data acquired by combining individual-level data. For instance, the output of an industry is an aggregate of the individual outputs of the firms within that industry. Aggregate data are applied in statistics, data warehouses, and economics.

There is a distinction between aggregate data and individual data. Aggregate data are individual data that have been averaged by geographic area, by year, by service agency, or by other means. Individual data are disaggregated individual results and are used to conduct analyses for estimating subgroup differences.

Aggregate data are mainly used by researchers and analysts, policymakers, banks and administrators. They are used to evaluate policies, recognise trends and patterns in processes, gain relevant insights, and assess current measures for strategic planning. Aggregate data collected from various sources are used in areas of study such as comparative political analysis and APD scientific meta-analysis. Aggregate data are also used for medical and educational purposes.

Aggregate data are widely used, but they have limitations, including the risk of drawing inaccurate inferences and false conclusions, which is termed the ‘ecological fallacy’. The ecological fallacy means that it is invalid to draw conclusions about the relationship between two quantitative variables at the individual level from the relationship observed at the aggregate level.


Applications

In statistics, aggregate data are data combined from several measurements. When data are aggregated, groups of observations are replaced with summary statistics based on those observations.

In a data warehouse, the use of aggregate data dramatically reduces the time needed to query large sets of data. Developers pre-summarise queries that are run regularly, such as weekly sales, across several dimensions, for example by item hierarchy or geographical hierarchy.

In economics, aggregate data or data aggregates are high-level data composed from a multitude or combination of more individual data, such as:
* in macroeconomics, data such as the overall price level or overall inflation rate; and
* in microeconomics, data of an entire sector of an economy composed of many firms, or of all households in a city or region.
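
The idea of replacing groups of observations with summary statistics, and of pre-summarising weekly sales across dimensions, can be illustrated with a minimal Python sketch. The transaction rows, the dimension names (item category and region) and the function `weekly_summary` are hypothetical and chosen only for illustration; a real data warehouse would typically do this in its own query layer.

```python
from collections import defaultdict
from datetime import date

# Hypothetical transaction-level rows: (sale date, item category, region, amount).
sales = [
    (date(2024, 1, 1), "snacks", "north", 12.0),
    (date(2024, 1, 3), "snacks", "north", 8.0),
    (date(2024, 1, 4), "drinks", "south", 5.0),
    (date(2024, 1, 9), "snacks", "north", 10.0),
    (date(2024, 1, 10), "drinks", "south", 7.5),
]

def weekly_summary(rows):
    """Replace individual sales with one summary row per
    (ISO year, ISO week, category, region) group."""
    groups = defaultdict(list)
    for day, category, region, amount in rows:
        iso = day.isocalendar()  # (ISO year, ISO week number, ISO weekday)
        groups[(iso[0], iso[1], category, region)].append(amount)
    return {
        key: {
            "count": len(amounts),
            "total": sum(amounts),
            "mean": sum(amounts) / len(amounts),
        }
        for key, amounts in groups.items()
    }

summary = weekly_summary(sales)
for key, stats in sorted(summary.items()):
    print(key, stats)
```

Once the summary is stored, a query for weekly sales reads one row per group instead of scanning every transaction, at the cost of no longer being able to see the individual sales.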


Major users


Researchers and analysts

Researchers use aggregate data to understand the prevalent ethos, evaluate the essence of social realities and of a social organisation, specify primary issues of concern in research, and supply projections about the nature of social issues. Aggregate data are useful when researchers want to investigate the relationship between two distinct variables at the aggregate level, or the connection between an aggregate variable and a characteristic at the individual level. Researchers have also used aggregate data to critically evaluate the policies, practices and precepts of systems, and to investigate their relevance and efficacy.


Policymakers

Governments use aggregate data to develop more effective policies, because the data serve as a measure of how well a government is aware of the demands and needs of its citizens and of how effectively it maintains social order. For example, governments around the world have used aggregate mobile location data for analysis in response to COVID-19. Aggregate mobile location data can provide insights into the effectiveness of social distancing measures launched by governments. Governments also use aggregate data to identify possible “hot spots” and the potential for transmission.

As well as projecting the effectiveness of government policies, analyses of aggregate data are undertaken to evaluate the nature, assess the extent, recognise the trend and study the pattern of a specific phenomenon or process, with the aim of devising strategies, preparing short- or long-term policies, and taking effective and relevant measures for control or prevention.

Policymakers also use financial aggregates data to evaluate the economic and financial activities of companies and households, because these data help to identify risks to financial stability. Policymakers can employ aggregate data to better understand developments in a country’s economic and financial conditions.


Banks

Banks collect aggregated data from a significant number of customers and then anonymise the data by removing personal information. The main reason banks use aggregate data is to estimate economic trends and gain insights into customer clusters. Banks are not permitted to share customers’ personal data, but aggregate data can be shared with banks’ business customers and accessed by other partners who use the same platform. In Australia, the Commonwealth Bank provides its business clients with anonymised data about their customers, derived from card transactions. ANZ also provides its business customers with anonymised data gathered from millions of merchant terminal transactions and ANZ card transactions.

In the UK, the Integrated Urgent Care Aggregate Data Collection (IUC ADC) provides comprehensive information about IUC activity, performance and service demand. Its data are sourced from the lead data providers responsible for offering integrated urgent care services in England. The National Health Service (NHS), under the Department of Health and Social Care (DHSC) in England, stated that this collection of aggregate data is going to replace the NHS 111 minimum dataset. It will also be used as a formal source for IUC statistics, as well as to monitor the Key Performance Indicators (KPIs) of the IUC ADC.


Administrators

Empirical data available at the national or regional level are used as sources of reference by administrators and intellectuals, as well as by people concerned about the welfare of a region or society. In particular, administrators use aggregate data to assess the current political, religious, social, or other atmosphere of a nation, to track gaps in social responses across time and space, and to set priorities for action. These assessments help administrators evaluate current measures, inform future strategic planning, and indicate effective corrective measures.


Sources and collection methods

Aggregate data can be composed from various types of writings and records, including biography, autobiography, descriptive accounts and correspondence. For example, a researcher collects, collates, or compiles aggregate data using multiple instruments of social research, including an inventory, an interview, an opinionnaire, and a questionnaire or schedule. Official and non-official agencies also collect and compile aggregate data on an ongoing basis, using the infrastructure available within a department at the field level.

Sources of aggregate data can also be regarded as tools for discovering data. In the US, some aggregate data are presented in the form of tables. Sources of US aggregate data include the United States Census Bureau, the Statistical Abstract of the United States, and Social Explorer. International Monetary Fund data, World DataBank, and the Penn World Table are examples of transnational and international aggregate data sources.


Use of aggregate data


Comparative political analysis

Aggregate data are used in comparative political analysis because analysts focus not only on the behaviour of individuals but also on the behaviour of areal units, including electoral constituencies and nations. In analyses of political activity, significant data such as those related to industrialisation, urbanization, and mass communication networks are not readily expressed at the individual level. They are expressed in per capita terms in order to control for variations in the population size of the areal units. Aggregate data are widely available because nations collect and publish demographic, socio-economic, and political data. This helps researchers and analysts carry out longer trend studies and bring changes and developments into sharper focus.
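
As a small worked illustration of why per capita terms are used, the sketch below compares two areal units on a raw total and on the same total divided by population. The unit names, the indicator (daily newspaper circulation) and all figures are invented for illustration only.

```python
# Hypothetical areal units: name -> (daily newspaper circulation, population).
units = {
    "Unit A": (500_000, 10_000_000),
    "Unit B": (200_000, 2_000_000),
}

def per_capita(total, population):
    """Express an aggregate total per person to control for population size."""
    return total / population

rates = {name: per_capita(total, pop) for name, (total, pop) in units.items()}

largest_total = max(units, key=lambda name: units[name][0])
largest_rate = max(rates, key=rates.get)

print(largest_total)  # Unit A has the larger raw circulation
print(largest_rate)   # Unit B has the larger circulation per person
```

The raw total ranks the more populous unit first, while the per capita figure shows that the smaller unit has the denser communication network, which is the comparison the analyst is usually after.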


APD scientific meta-analyses

Factors including the need for time, considerable resources and wide international cooperation have impeded the use of individual patient data (IPD) meta-analysis, so most published meta-analyses rely on aggregate patient data (APD). To obtain data from all trials on all patients, aggregate patient data are collected from completed studies that have been presented at professional meetings, published in the medical literature, or supplied directly by individual investigators. Aggregate patient data are used by the Cochrane Collaboration, the United States Preventive Services Task Force, and multiple professional societies to support clinical practice guidelines. Aggregate patient data are also used in meta-analyses of time-to-event studies, as the results can inform investigators whether it is worthwhile to proceed to more resource-intensive meta-analyses based on individual patient data.


Other uses


Health care

In a health information system, aggregate data are the integration of data concerning numerous patients. A particular patient cannot be traced from aggregate data. These aggregated data are only counts, such as the number of cases of tuberculosis, malaria, or other diseases. Health facilities use aggregated statistics of this type to generate reports and indicators, and to undertake strategic planning in their health systems. By contrast, patient data are individual data about a single patient, including the patient’s name, age, diagnosis and medical history. Patient-based data are mainly used to track the progress of a patient over time, such as how the patient responds to a particular treatment.

The COVID-19 Data Archive, also called COVID-ARC, aggregates data from studies around the globe. Researchers can access the discoveries of international colleagues and forge collaborations that facilitate the fight against the disease. More generally, aggregated healthcare data allow health care providers to unlock actionable clinical insights, for instance when thorough views of clinical data or continuous patient records become possible.
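
The contrast drawn in this section between patient-level records and aggregate counts can be sketched as follows. The records, field names and the function `disease_counts` are hypothetical; the sketch only shows that the aggregate keeps a count per disease and discards the identifying fields.

```python
from collections import Counter

# Hypothetical patient-level records; every name and value is invented.
patients = [
    {"name": "Patient A", "age": 34, "diagnosis": "malaria"},
    {"name": "Patient B", "age": 51, "diagnosis": "tuberculosis"},
    {"name": "Patient C", "age": 8, "diagnosis": "malaria"},
    {"name": "Patient D", "age": 67, "diagnosis": "tuberculosis"},
    {"name": "Patient E", "age": 29, "diagnosis": "malaria"},
]

def disease_counts(records):
    """Aggregate patient records into counts per diagnosis.

    Only the diagnosis field is read, so names and ages never reach the output.
    """
    return Counter(record["diagnosis"] for record in records)

counts = disease_counts(patients)
print(dict(counts))
```

Counts like these are enough for reports and indicators, but they cannot show how any one patient responds to treatment over time, which is why patient-based data are kept separately for that purpose.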


Education

Aggregate data such as school-level demographic data and school-level achievement data are used in experimental analysis to assess the relationship between student achievement and school-level interventions. Aggregate data can also be used in non-experimental analyses such as regression discontinuity analysis and interrupted time-series analysis, which do not require individual-level data. For example, interrupted time-series analysis estimates the impact of a school-level program by comparing a school’s achievement before and after the program is launched.
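
A simplified sketch of the before-and-after comparison in an interrupted time-series analysis is given below. It assumes one particular approach, which the text above does not specify: fit a straight-line trend to the school’s pre-program yearly mean scores, project that trend into the post-program years, and average the gap between observed and projected scores. The scores, years and function names are hypothetical, and a real analysis would also address uncertainty and autocorrelation.

```python
def fit_line(xs, ys):
    """Ordinary least-squares straight line; returns (slope, intercept)."""
    n = len(xs)
    x_mean = sum(xs) / n
    y_mean = sum(ys) / n
    sxx = sum((x - x_mean) ** 2 for x in xs)
    sxy = sum((x - x_mean) * (y - y_mean) for x, y in zip(xs, ys))
    slope = sxy / sxx
    return slope, y_mean - slope * x_mean

def program_effect(pre, post):
    """Mean gap between observed post-program scores and the pre-program trend."""
    pre_years, pre_scores = zip(*pre)
    slope, intercept = fit_line(pre_years, pre_scores)
    gaps = [score - (intercept + slope * year) for year, score in post]
    return sum(gaps) / len(gaps)

# Hypothetical school-level mean scores: (year, mean score). No student-level data.
pre_program = [(2015, 60.0), (2016, 61.0), (2017, 62.0), (2018, 63.0), (2019, 64.0)]
post_program = [(2020, 68.0), (2021, 69.0), (2022, 70.0)]

effect = program_effect(pre_program, post_program)
print(effect)  # 3.0 points above the projected pre-program trend
```

Projecting the earlier trend, rather than comparing simple before and after averages, avoids crediting the program with improvement that was already under way.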


Limitations

When units are averaged within a cluster or within a country, information is lost, which increases the probability of drawing inaccurate inferences. Information loss occurs because aggregation treats individual variation as if it were only a type of statistical noise or measurement error. Inferences also differ depending on whether individual firm data or aggregated data are used for analysis. For instance, the calculation of country averages does not account for firm-specific variables such as firm size, firm age, or firm-ownership concentration, whereas calculation at the individual level does. Results generated from aggregate data therefore differ from those generated from individual data.

There is also the problem of the ‘ecological fallacy’, a concept introduced by Robinson (1950). The variability around the individual-level means is significantly different from the variability around the aggregate means. Aggregate measures express something other than the individual-level equivalents of the aggregate data, which means that individual-level conclusions cannot be drawn from them.

Although aggregate data have wider applicability than individual-level data, it is more challenging for researchers to analyse subgroup results when aggregate data are used, and individual information may eventually be required. Growth modelling and longitudinal modelling based on aggregate data are also difficult because variables can vary over time.
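
The ecological fallacy can be made concrete with a small constructed example. The data below are invented so that the correlation between group averages is perfectly positive while the correlation among individuals inside every group is perfectly negative; the group labels and the helper `pearson` are illustrative only.

```python
import math

def pearson(xs, ys):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(xs)
    x_mean = sum(xs) / n
    y_mean = sum(ys) / n
    sxy = sum((x - x_mean) * (y - y_mean) for x, y in zip(xs, ys))
    sxx = sum((x - x_mean) ** 2 for x in xs)
    syy = sum((y - y_mean) ** 2 for y in ys)
    return sxy / math.sqrt(sxx * syy)

# Hypothetical individual-level observations (x, y), grouped by areal unit.
groups = {
    "A": [(1, 3), (2, 2), (3, 1)],
    "B": [(4, 6), (5, 5), (6, 4)],
    "C": [(7, 9), (8, 8), (9, 7)],
}

# Aggregate level: one (mean x, mean y) pair per group.
mean_x = [sum(x for x, _ in rows) / len(rows) for rows in groups.values()]
mean_y = [sum(y for _, y in rows) / len(rows) for rows in groups.values()]
aggregate_r = pearson(mean_x, mean_y)

# Individual level: correlation computed separately inside each group.
within_r = {
    name: pearson([x for x, _ in rows], [y for _, y in rows])
    for name, rows in groups.items()
}

print(aggregate_r)  # 1.0: group averages rise together
print(within_r)     # -1.0 in every group: individuals show the opposite pattern
```

An analyst who saw only the three group averages would conclude that higher x goes with higher y, yet for the individuals within each group the reverse holds, which is exactly the inference the aggregate data cannot support.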


Other types of aggregate data


Financial aggregates data

Financial aggregates data are a type of aggregate data about credit and the money supply in Australia, used by policymakers to evaluate the economic and financial activities of both households and companies.


Credit aggregates

Credit aggregates are measurements of the borrowings of households and businesses from financial intermediaries. They also measure the funds borrowed by businesses for purposes including project investment, asset purchases, or cash flow management.


Monetary aggregates

Monetary aggregates are measurements of the money or ‘money-like’ instruments that the banking system owes to businesses and households. An example of a ‘money-like’ instrument is a deposit in a bank account.


Census aggregate data

In the UK, census aggregate data are data generated as outputs from the United Kingdom censuses. They provide information about the socio-economic and demographic characteristics of the country’s population. They are a compilation of aggregated, or summarised, counts of the number of individuals, household residents, or families in particular geographic areas with specific characteristics, or combinations of characteristics, drawn from the topics of people and places, populations, families, health, ethnicity and religion, housing and work. Aggregate data are components of the UK censuses’ outputs and are obtained from analysis of the information given in the census returns. Census aggregate data are used to compare and describe population characteristics across locations in the UK because they provide comparable information at a range of geographical levels over the entire UK. They are also used in the academic sector for teaching and research, and in the private sector for site location and marketing.

